Journal: bioRxiv
Article Title: Integrating Histology with Spatial Molecular Programs Using a Multimodal Foundation Model
doi: 10.64898/2026.06.01.729028
Figure Legend Snippet: A . Workflow for predicting spatial gene-expression programs directly from histology images using SQUALL. B . Benchmarking of virtual biomarker prediction on internal and external Xenium5K sections. SQUALL was benchmarked against ST-Net, iSTAR, EGN, EGNv2, and Path2Space on three Xenium5K sections. Performance was evaluated using Pearson correlation ( Left ) per tissue section and ( Right ) overall. HCC: hepatocellular carcinoma; OC: ovarian carcinoma; CC: cervical carcinoma. Box plots: center line, median; box limits, upper and lower quartiles; whiskers, 1.5× interquartile range; statistical test: two-sided Mann-Whitney U test. C . Pairwise scatterplots comparing per-gene Pearson correlations between SQUALL (x-axis) and each of six competing methods (one panel per method). D . Bar plot showing reverse-ranking results of prediction performance across biomarker subsets. Prediction performance was inverse ranked within each section by average Pearson correlation, with a higher rank indicating better performance. E . Representative virtual biomarker predictions. Left to right: original histology images (20× magnification) and virtually profiled expression of selected genes from all competing methods and SQUALL. Rows show ground truth and predictions for MET (HCC, internal), IFGR1 (OC, internal), and STAT1 (CC, external). Scale bar, 1 mm. F . Schematic of SQUALL applied to virtual biomarker profiling directly from cohort-level histology whole-slide images. G . Bar plot showing GSEA results based on SQUALL-virtually profiled bulk gene expression in the TCGA-LIHC cohort. Genes were ranked by coefficients from a multivariable Cox regression model adjusted for age and stage. Pathways with NES > 0 are associated with poorer prognosis, whereas pathways with NES < 0 are associated with better prognosis. NES, normalized enrichment score. H . Spatial plot of representative annotated tumor regions from Chiara et al. Scale bar, 1 mm. I . 
Ranking plot of hazard ratios for SQUALL-virtually profiled gene expression within tumor regions of the TCGA-CESC cohort. Hazard ratios were estimated using multivariable Cox proportional hazards models adjusted for age and stage. Representative genes are shown, including DNA double-strand break response (blue), DNA repair (orange), innate immune/complement regulation (green), and adaptive immune response (violet) genes. Solid dots indicate genes with FDR < 0.1. J . Ridge plot of gene set enrichment analysis based on SQUALL-virtually profiled tumor-region biomarkers in the TCGA-CESC cohort. NES, normalized enrichment score. K . Hazard ratios for clinical variables and SQUALL-predicted CD8+ T cell signatures in the TCGA-CESC cohort, estimated using univariate Cox proportional hazards models. Dots indicate hazard ratios and bars indicate 95% confidence intervals. Statistical significance was assessed using a one-sided Wald test. L . Representative examples of predicted CD8+ T cell infiltration in the TCGA-CESC cohort. Long-term survivor with predicted intratumoral CD8+ T cells (red dashed area) ( Left ). Short-term survivor with CD8+ cells mainly at the tumor margin ( Right ). Scale bar, 5 mm. Statistical significance: * p-value < 0.05, ** p-value < 0.01, *** p-value < 0.001, **** p-value < 0.0001; n.s., not significant.
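Illustrative sketch: the benchmarking metric in panels B to D (per-gene Pearson correlation, then inverse ranking of methods within a section by average correlation) could be computed roughly as below. This is a minimal sketch and not the authors' code. The function names, the dictionary layout (gene mapped to a list of per-spot values), and the tie handling (ties are broken arbitrarily by sort order) are assumptions made for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return sxy / math.sqrt(sxx * syy)


def per_gene_pearson(truth, pred):
    """Hypothetical layout: truth[gene] and pred[gene] are per-spot value lists."""
    return {gene: pearson(truth[gene], pred[gene]) for gene in truth}


def mean_correlation(per_gene):
    """Average the per-gene correlations, skipping undefined (NaN) entries."""
    values = [r for r in per_gene.values() if not math.isnan(r)]
    return sum(values) / len(values)


def inverse_rank(mean_r_by_method):
    """Rank methods within one section so that a higher rank means a higher mean r."""
    ordered = sorted(mean_r_by_method, key=mean_r_by_method.get)  # ascending by mean r
    return {method: rank for rank, method in enumerate(ordered, start=1)}
```

With three methods, the best-performing method receives rank 3 and the worst receives rank 1, matching the legend's convention that a higher rank indicates better performance.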
Article Snippet: Three Xenium5K tissue sections were used for benchmarking ( Table S31 ): one from the SPATCH cohort (hepatocellular carcinoma) and two public datasets from 10x Genomics (ovarian cancer: https://www.10xgenomics.com/cn/datasets/xenium-prime-ffpe-human-ovarian-cancer ; cervical cancer: https://www.10xgenomics.com/cn/datasets/xenium-prime-ffpe-human-cervical-cancer ).
Techniques: Gene Expression, Biomarker Discovery, MANN-WHITNEY, Expressing
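Illustrative sketch: the legend marks genes with FDR < 0.1 in panel I but does not state which correction procedure was used. Assuming the common Benjamini-Hochberg procedure (an assumption, not confirmed by the source), adjusted p-values could be obtained as follows. The function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Genes whose adjusted value falls below 0.1 would then correspond to the solid dots under this assumption.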